Overview

Dataset statistics

Number of variables: 10
Number of observations: 97520
Missing cells: 152244
Missing cells (%): 15.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.1 MiB
Average record size in memory: 44.0 B
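
The missing-cell percentage can be checked from the counts in this report: the per-variable missing counts listed in the Variables section add up to the total, which is then divided by the number of cells (observations × variables). A minimal sketch using only figures copied from the report:

```python
# Per-variable missing counts, copied from the Variables section of this report.
missing_by_variable = {
    "t_morn": 87,
    "t_noon": 47,
    "t_evn": 1965,
    "tmax": 37629,
    "tmin": 37639,
    "tmean": 74877,
}

n_variables = 10
n_observations = 97520

missing_cells = sum(missing_by_variable.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)           # total number of missing cells
print(f"{missing_pct:.1f}%")   # share of all cells that are missing
```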

Variable types

Numeric: 9
Categorical: 1

Alerts

t_morn is highly correlated with t_noon and 4 other fields (High correlation)
t_noon is highly correlated with t_morn and 4 other fields (High correlation)
t_evn is highly correlated with t_morn and 4 other fields (High correlation)
tmax is highly correlated with t_morn and 4 other fields (High correlation)
tmin is highly correlated with t_morn and 4 other fields (High correlation)
tmean is highly correlated with t_morn and 4 other fields (High correlation)
month is highly correlated with t_morn and 5 other fields (High correlation)
t_morn is highly correlated with month and 5 other fields (High correlation)
t_noon is highly correlated with month and 5 other fields (High correlation)
t_evn is highly correlated with month and 5 other fields (High correlation)
tmax is highly correlated with month and 5 other fields (High correlation)
tmin is highly correlated with month and 5 other fields (High correlation)
tmean is highly correlated with month and 5 other fields (High correlation)
t_evn has 1965 (2.0%) missing values (Missing)
tmax has 37629 (38.6%) missing values (Missing)
tmin has 37639 (38.6%) missing values (Missing)
tmean has 74877 (76.8%) missing values (Missing)
t_morn has 1567 (1.6%) zeros (Zeros)
t_noon has 1156 (1.2%) zeros (Zeros)
t_evn has 1387 (1.4%) zeros (Zeros)
tmin has 1335 (1.4%) zeros (Zeros)

Reproduction

Analysis started: 2022-08-09 06:45:17.743609
Analysis finished: 2022-08-09 06:45:26.120518
Duration: 8.38 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

month
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.523010664
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 381.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.448698583
Coefficient of variation (CV): 0.5286973701
Kurtosis: -1.208030729
Mean: 6.523010664
Median Absolute Deviation (MAD): 3
Skewness: -0.009324438533
Sum: 636124
Variance: 11.89352192
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)
Value               Count   Frequency (%)
1                   8277    8.5%
3                   8277    8.5%
5                   8277    8.5%
7                   8277    8.5%
8                   8277    8.5%
10                  8277    8.5%
12                  8277    8.5%
4                   8010    8.2%
6                   8010    8.2%
9                   8010    8.2%
Other values (2)    15551   15.9%

Value               Count   Frequency (%)
1                   8277    8.5%
2                   7541    7.7%
3                   8277    8.5%
4                   8010    8.2%
5                   8277    8.5%
6                   8010    8.2%
7                   8277    8.5%
8                   8277    8.5%
9                   8010    8.2%
10                  8277    8.5%

Value               Count   Frequency (%)
12                  8277    8.5%
11                  8010    8.2%
10                  8277    8.5%
9                   8010    8.2%
8                   8277    8.5%
7                   8277    8.5%
6                   8010    8.2%
5                   8277    8.5%
4                   8010    8.2%
3                   8277    8.5%
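
The descriptive statistics for month can be recomputed from its frequency table. The sketch below uses the twelve value counts listed above and assumes the reported variance is the sample variance (dividing by n - 1), since that convention matches the reported figure:

```python
import math

# Value -> count, copied from the month frequency tables above.
month_counts = {
    1: 8277, 2: 7541, 3: 8277, 4: 8010, 5: 8277, 6: 8010,
    7: 8277, 8: 8277, 9: 8010, 10: 8277, 11: 8010, 12: 8277,
}

n = sum(month_counts.values())                        # number of observations
total = sum(v * c for v, c in month_counts.items())   # "Sum" in the report
mean = total / n

# Sample variance: squared deviations divided by n - 1 (assumed convention).
sum_sq_dev = sum(c * (v - mean) ** 2 for v, c in month_counts.items())
variance = sum_sq_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean   # coefficient of variation = standard deviation / mean

print(n, total)
print(round(mean, 6), round(variance, 4), round(std, 4), round(cv, 4))
```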

day
Real number (ℝ≥0)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.7293991
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 381.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 8
Median: 16
Q3: 23
95-th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.800036562
Coefficient of variation (CV): 0.5594642559
Kurtosis: -1.194007105
Mean: 15.7293991
Median Absolute Deviation (MAD): 8
Skewness: 0.006779618683
Sum: 1533931
Variance: 77.44064349
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=31)
Value               Count   Frequency (%)
1                   3204    3.3%
2                   3204    3.3%
28                  3204    3.3%
27                  3204    3.3%
26                  3204    3.3%
25                  3204    3.3%
24                  3204    3.3%
23                  3204    3.3%
22                  3204    3.3%
21                  3204    3.3%
Other values (21)   65480   67.1%

Value               Count   Frequency (%)
1                   3204    3.3%
2                   3204    3.3%
3                   3204    3.3%
4                   3204    3.3%
5                   3204    3.3%
6                   3204    3.3%
7                   3204    3.3%
8                   3204    3.3%
9                   3204    3.3%
10                  3204    3.3%

Value               Count   Frequency (%)
31                  1869    1.9%
30                  2937    3.0%
29                  3002    3.1%
28                  3204    3.3%
27                  3204    3.3%
26                  3204    3.3%
25                  3204    3.3%
24                  3204    3.3%
23                  3204    3.3%
22                  3204    3.3%

t_morn
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 683
Distinct (%): 0.7%
Missing: 87
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.117839027
Minimum: -32
Maximum: 28
Zeros: 1567
Zeros (%): 1.6%
Negative: 25979
Negative (%): 26.6%
Memory size: 381.1 KiB

Quantile statistics

Minimum: -32
5-th percentile: -9
Q1: -0.5
Median: 4.699999809
Q3: 12
95-th percentile: 18
Maximum: 28
Range: 60
Interquartile range (IQR): 12.5

Descriptive statistics

Standard deviation: 8.360211372
Coefficient of variation (CV): 1.633543245
Kurtosis: -0.3724749684
Mean: 5.117839027
Median Absolute Deviation (MAD): 6.199999809
Skewness: -0.2039891332
Sum: 498646.41
Variance: 69.89312744
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
1                   1771    1.8%
2                   1623    1.7%
0                   1567    1.6%
3                   1479    1.5%
-1                  1373    1.4%
14                  1318    1.4%
15                  1298    1.3%
10                  1269    1.3%
5                   1264    1.3%
4                   1263    1.3%
Other values (673)  83208   85.3%

Value               Count   Frequency (%)
-32                 1       < 0.1%
-31                 1       < 0.1%
-30                 1       < 0.1%
-29.20000076        1       < 0.1%
-29                 3       < 0.1%
-27.60000038        1       < 0.1%
-27.5               1       < 0.1%
-27                 3       < 0.1%
-26.5               2       < 0.1%
-26                 4       < 0.1%

Value               Count   Frequency (%)
28                  1       < 0.1%
27.5                1       < 0.1%
27.39999962         4       < 0.1%
27.20000076         1       < 0.1%
27                  3       < 0.1%
26.79999924         1       < 0.1%
26.5                2       < 0.1%
26.39999962         6       < 0.1%
26.20000076         1       < 0.1%
26                  9       < 0.1%
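
Values such as 4.699999809 and -29.20000076 in the tables above are consistent with temperatures recorded to one decimal place and then stored as 32-bit floats. That is an assumption: the report does not state the column dtype. A small sketch of the effect, using only the standard library:

```python
import struct

def as_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

# 4.7 and -29.2 cannot be represented exactly in binary floating point;
# printing the 32-bit value with 10 significant digits shows the residue.
print(f"{as_float32(4.7):.10g}")
print(f"{as_float32(-29.2):.10g}")
```

If the assumption holds, rounding such values to one decimal place recovers the recorded temperatures.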

t_noon
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 742
Distinct (%): 0.8%
Missing: 47
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.473754577
Minimum: -26
Maximum: 36
Zeros: 1156
Zeros (%): 1.2%
Negative: 18293
Negative (%): 18.8%
Memory size: 381.1 KiB

Quantile statistics

Minimum: -26
5-th percentile: -6
Q1: 1.200000048
Median: 8
Q3: 16.10000038
95-th percentile: 23.20000076
Maximum: 36
Range: 62
Interquartile range (IQR): 14.90000033

Descriptive statistics

Standard deviation: 9.342841148
Coefficient of variation (CV): 1.10256216
Kurtosis: -0.7378507853
Mean: 8.473754577
Median Absolute Deviation (MAD): 7.300000191
Skewness: -0.008163645864
Sum: 825962.2799
Variance: 87.28868866
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
2                   1428    1.5%
1                   1412    1.4%
3                   1411    1.4%
5                   1246    1.3%
4                   1219    1.2%
15                  1192    1.2%
0                   1156    1.2%
16                  1137    1.2%
18                  1122    1.2%
17                  1092    1.1%
Other values (732)  85058   87.2%

Value               Count   Frequency (%)
-26                 1       < 0.1%
-24                 1       < 0.1%
-23.79999924        1       < 0.1%
-23                 1       < 0.1%
-22.39999962        1       < 0.1%
-22                 3       < 0.1%
-21.5               2       < 0.1%
-21.29999924        1       < 0.1%
-21                 7       < 0.1%
-20.60000038        1       < 0.1%

Value               Count   Frequency (%)
36                  1       < 0.1%
35.5                1       < 0.1%
35                  1       < 0.1%
34                  1       < 0.1%
33.79999924         1       < 0.1%
33.59999847         1       < 0.1%
33.5                2       < 0.1%
33.29999924         1       < 0.1%
33                  5       < 0.1%
32.79999924         2       < 0.1%

t_evn
Real number (ℝ)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 682
Distinct (%): 0.7%
Missing: 1965
Missing (%): 2.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.970616712
Minimum: -27
Maximum: 30
Zeros: 1387
Zeros (%): 1.4%
Negative: 22497
Negative (%): 23.1%
Memory size: 381.1 KiB

Quantile statistics

Minimum: -27
5-th percentile: -7.699999809
Q1: 0.1000000015
Median: 5.5
Q3: 13
95-th percentile: 18.79999924
Maximum: 30
Range: 57
Interquartile range (IQR): 12.9

Descriptive statistics

Standard deviation: 8.312794685
Coefficient of variation (CV): 1.392284095
Kurtosis: -0.4882104993
Mean: 5.970616712
Median Absolute Deviation (MAD): 6.300000191
Skewness: -0.1454552114
Sum: 570522.2799
Variance: 69.10255432
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
1                   1639    1.7%
2                   1628    1.7%
3                   1501    1.5%
15                  1431    1.5%
0                   1387    1.4%
14                  1314    1.3%
4                   1280    1.3%
10                  1277    1.3%
13                  1240    1.3%
12                  1236    1.3%
Other values (672)  81622   83.7%
(Missing)           1965    2.0%

Value               Count   Frequency (%)
-27                 2       < 0.1%
-26.5               1       < 0.1%
-26                 3       < 0.1%
-25.5               1       < 0.1%
-25                 3       < 0.1%
-24.79999924        1       < 0.1%
-24                 4       < 0.1%
-23.79999924        1       < 0.1%
-23.39999962        1       < 0.1%
-23.20000076        1       < 0.1%

Value               Count   Frequency (%)
30                  1       < 0.1%
29.89999962         1       < 0.1%
29                  1       < 0.1%
28.89999962         1       < 0.1%
28.79999924         1       < 0.1%
28.60000038         1       < 0.1%
28.5                2       < 0.1%
28.39999962         1       < 0.1%
28.29999924         1       < 0.1%
28.20000076         2       < 0.1%

tmax
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 510
Distinct (%): 0.9%
Missing: 37629
Missing (%): 38.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.919717489
Minimum: -22.10000038
Maximum: 35.40000153
Zeros: 702
Zeros (%): 0.7%
Negative: 7608
Negative (%): 7.8%
Memory size: 381.1 KiB

Quantile statistics

Minimum: -22.10000038
5-th percentile: -3.700000048
Q1: 2.799999952
Median: 9.199999809
Q3: 17.5
95-th percentile: 24
Maximum: 35.40000153
Range: 57.50000191
Interquartile range (IQR): 14.70000005

Descriptive statistics

Standard deviation: 8.969659805
Coefficient of variation (CV): 0.9042253285
Kurtosis: -0.837141633
Mean: 9.919717489
Median Absolute Deviation (MAD): 7.199999809
Skewness: 0.04738160223
Sum: 594101.8001
Variance: 80.45480347
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
3                   1031    1.1%
2                   996     1.0%
4                   900     0.9%
5                   866     0.9%
1                   820     0.8%
20                  805     0.8%
19                  765     0.8%
7                   750     0.8%
6                   749     0.8%
18                  743     0.8%
Other values (500)  51466   52.8%
(Missing)           37629   38.6%

Value               Count   Frequency (%)
-22.10000038        1       < 0.1%
-19.89999962        1       < 0.1%
-19.70000076        2       < 0.1%
-19.10000038        1       < 0.1%
-19                 1       < 0.1%
-18.89999962        1       < 0.1%
-18.60000038        1       < 0.1%
-18                 1       < 0.1%
-17.79999924        1       < 0.1%
-17.60000038        1       < 0.1%

Value               Count   Frequency (%)
35.40000153         1       < 0.1%
35                  1       < 0.1%
34.59999847         1       < 0.1%
34.40000153         1       < 0.1%
34.20000076         3       < 0.1%
34.09999847         1       < 0.1%
33.29999924         1       < 0.1%
33.20000076         1       < 0.1%
33.09999847         1       < 0.1%
33                  2       < 0.1%

tmin
Real number (ℝ)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 458
Distinct (%): 0.8%
Missing: 37639
Missing (%): 38.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.343330941
Minimum: -30
Maximum: 23
Zeros: 1335
Zeros (%): 1.4%
Negative: 19694
Negative (%): 20.2%
Memory size: 381.1 KiB

Quantile statistics

Minimum: -30
5-th percentile: -10
Q1: -1.700000048
Median: 3
Q3: 9.800000191
95-th percentile: 14.80000019
Maximum: 23
Range: 53
Interquartile range (IQR): 11.50000024

Descriptive statistics

Standard deviation: 7.627869606
Coefficient of variation (CV): 2.281517965
Kurtosis: -0.3593040705
Mean: 3.343330941
Median Absolute Deviation (MAD): 5.699999809
Skewness: -0.2866799533
Sum: 200202.0001
Variance: 58.18439484
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
0                   1335    1.4%
1                   1103    1.1%
-1                  1096    1.1%
10                  978     1.0%
-2                  955     1.0%
2                   950     1.0%
11                  926     0.9%
12                  884     0.9%
9                   843     0.9%
5                   833     0.9%
Other values (448)  49978   51.2%
(Missing)           37639   38.6%

Value               Count   Frequency (%)
-30                 1       < 0.1%
-28.20000076        1       < 0.1%
-26.39999962        1       < 0.1%
-25.70000076        1       < 0.1%
-25.5               1       < 0.1%
-25.39999962        1       < 0.1%
-25.10000038        2       < 0.1%
-25                 2       < 0.1%
-24.89999962        1       < 0.1%
-24.79999924        1       < 0.1%

Value               Count   Frequency (%)
23                  1       < 0.1%
22.39999962         1       < 0.1%
22                  1       < 0.1%
21.60000038         1       < 0.1%
21.5                2       < 0.1%
21.20000076         2       < 0.1%
21.10000038         1       < 0.1%
21                  3       < 0.1%
20.89999962         1       < 0.1%
20.79999924         3       < 0.1%

tmean
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 446
Distinct (%): 2.0%
Missing: 74877
Missing (%): 76.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.298321778
Minimum: -23.89999962
Maximum: 28.29999924
Zeros: 99
Zeros (%): 0.1%
Negative: 4212
Negative (%): 4.3%
Memory size: 381.1 KiB

Quantile statistics

Minimum: -23.89999962
5-th percentile: -5.599999905
Q1: 1.299999952
Median: 7
Q3: 14.10000038
95-th percentile: 19.60000038
Maximum: 28.29999924
Range: 52.19999886
Interquartile range (IQR): 12.80000043

Descriptive statistics

Standard deviation: 7.99364233
Coefficient of variation (CV): 1.095271293
Kurtosis: -0.6367722154
Mean: 7.298321778
Median Absolute Deviation (MAD): 6.300000191
Skewness: -0.1211112514
Sum: 165255.9
Variance: 63.89831543
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
1.899999976         142     0.1%
2.400000095         140     0.1%
0.8999999762        130     0.1%
1.200000048         125     0.1%
1.5                 124     0.1%
1.700000048         123     0.1%
3.099999905         121     0.1%
15.10000038         120     0.1%
2.299999952         118     0.1%
2.099999905         118     0.1%
Other values (436)  21382   21.9%
(Missing)           74877   76.8%

Value               Count   Frequency (%)
-23.89999962        1       < 0.1%
-21.79999924        1       < 0.1%
-21.20000076        1       < 0.1%
-20.5               1       < 0.1%
-20.39999962        1       < 0.1%
-19.79999924        1       < 0.1%
-18.89999962        1       < 0.1%
-18.39999962        1       < 0.1%
-17.70000076        4       < 0.1%
-17.60000038        1       < 0.1%

Value               Count   Frequency (%)
28.29999924         1       < 0.1%
27.60000038         1       < 0.1%
27.20000076         1       < 0.1%
27.10000038         1       < 0.1%
27                  1       < 0.1%
26.70000076         2       < 0.1%
26.60000038         1       < 0.1%
26.20000076         2       < 0.1%
26.10000038         1       < 0.1%
26                  2       < 0.1%

method
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 762.0 KiB

manual: 95694
automatic: 1826

Length

Max length: 9
Median length: 6
Mean length: 6.056173093
Min length: 6

Characters and Unicode

Total characters: 590598
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: manual
2nd row: manual
3rd row: manual
4th row: manual
5th row: manual

Common Values

Value               Count   Frequency (%)
manual              95694   98.1%
automatic           1826    1.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value               Count   Frequency (%)
manual              95694   98.1%
automatic           1826    1.9%

Most occurring characters

Value               Count   Frequency (%)
a                   195040  33.0%
m                   97520   16.5%
u                   97520   16.5%
n                   95694   16.2%
l                   95694   16.2%
t                   3652    0.6%
o                   1826    0.3%
i                   1826    0.3%
c                   1826    0.3%
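
The character statistics follow directly from the two category counts. A minimal sketch that rebuilds the character frequencies, the total character count and the mean length from the counts reported for method:

```python
from collections import Counter

# Category counts copied from the "method" variable above.
category_counts = {"manual": 95694, "automatic": 1826}

# Each character of a category occurs once per row holding that category.
char_counts = Counter()
for word, count in category_counts.items():
    for ch in word:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(category_counts.values())

print(total_chars, len(char_counts), round(mean_length, 6))
print(char_counts.most_common(3))
```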

Most occurring categories

Value               Count   Frequency (%)
Lowercase Letter    590598  100.0%

Most frequent character per category

Lowercase Letter
Value               Count   Frequency (%)
a                   195040  33.0%
m                   97520   16.5%
u                   97520   16.5%
n                   95694   16.2%
l                   95694   16.2%
t                   3652    0.6%
o                   1826    0.3%
i                   1826    0.3%
c                   1826    0.3%

Most occurring scripts

Value               Count   Frequency (%)
Latin               590598  100.0%

Most frequent character per script

Latin
Value               Count   Frequency (%)
a                   195040  33.0%
m                   97520   16.5%
u                   97520   16.5%
n                   95694   16.2%
l                   95694   16.2%
t                   3652    0.6%
o                   1826    0.3%
i                   1826    0.3%
c                   1826    0.3%

Most occurring blocks

Value               Count   Frequency (%)
ASCII               590598  100.0%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
a                   195040  33.0%
m                   97520   16.5%
u                   97520   16.5%
n                   95694   16.2%
l                   95694   16.2%
t                   3652    0.6%
o                   1826    0.3%
i                   1826    0.3%
c                   1826    0.3%

year
Real number (ℝ≥0)

Distinct: 262
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1888.906501
Minimum: 1756
Maximum: 2017
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 381.1 KiB

Quantile statistics

Minimum: 1756
5-th percentile: 1769
Q1: 1822
Median: 1889
Q3: 1956
95-th percentile: 2009
Maximum: 2017
Range: 261
Interquartile range (IQR): 134

Descriptive statistics

Standard deviation: 76.92028478
Coefficient of variation (CV): 0.04072212401
Kurtosis: -1.208119673
Mean: 1888.906501
Median Absolute Deviation (MAD): 67
Skewness: -0.006556520474
Sum: 184206162
Variance: 5916.730211
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
2016                732     0.8%
2015                730     0.7%
2014                730     0.7%
2017                730     0.7%
2013                730     0.7%
1756                366     0.4%
1808                366     0.4%
1940                366     0.4%
1788                366     0.4%
1816                366     0.4%
Other values (252)  92038   94.4%

Value               Count   Frequency (%)
1756                366     0.4%
1757                365     0.4%
1758                365     0.4%
1759                365     0.4%
1760                366     0.4%
1761                365     0.4%
1762                365     0.4%
1763                365     0.4%
1764                366     0.4%
1765                365     0.4%

Value               Count   Frequency (%)
2017                730     0.7%
2016                732     0.8%
2015                730     0.7%
2014                730     0.7%
2013                730     0.7%
2012                366     0.4%
2011                365     0.4%
2010                365     0.4%
2009                365     0.4%
2008                366     0.4%

Interactions

[Figures: 81 pairwise interaction plots (images not included)]

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
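
The definition above can be sketched in plain Python: rank both variables (ties share the average of their ranks), then apply the covariance-over-standard-deviations formula to the ranks. The function names and sample data here are illustrative, not part of the report:

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x

print(spearman(x, y))  # approximately 1: the relationship is perfectly monotonic
print(pearson(x, y))   # below 1: the relationship is not linear
```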

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
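
A minimal sketch of this formula, which also checks the invariance property by converting one variable from Celsius to Fahrenheit (a change of location and scale). The temperature values are hypothetical and chosen only for illustration:

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

# Hypothetical morning and noon temperatures in degrees Celsius.
morning = [-7.5, -3.0, 0.5, 4.0, 12.0]
noon = [-4.5, -2.0, 2.5, 8.0, 16.0]

r = pearson_r(morning, noon)

# Rescaling one variable (Celsius to Fahrenheit) leaves r unchanged.
noon_f = [t * 9 / 5 + 32 for t in noon]
r_f = pearson_r(morning, noon_f)

print(round(r, 6), round(r_f, 6))
```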

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
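
The pair-counting definition can be sketched as follows. This is the simple form described in the text (often called tau-a); statistical libraries commonly apply an additional correction for ties, so their results may differ on tied data. The function name and sample data are illustrative:

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1    # both variables move in the same direction
        elif s < 0:
            discordant += 1    # the variables move in opposite directions
        # s == 0 is a tie in x or y and counts as neither
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]

# 7 concordant and 3 discordant pairs out of 10.
print(kendall_tau_a(x, y))
```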

Phik (φk)

Phik (φk) is a practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.

The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
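
Nullity correlation is commonly computed as the Pearson correlation between the presence indicators of two columns (1 where a value is present, 0 where it is missing). A minimal sketch under that assumption, with hypothetical rows in which `None` marks a missing value:

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'value is present' indicators."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    if va == 0 or vb == 0:
        # A column that is always present (or always missing) has no
        # variation in nullity, so the correlation is undefined.
        return float("nan")
    return cov / sqrt(va * vb)

# Hypothetical rows, for illustration only.
tmax = [-7.0, None, None, -3.6, None, -13.1]
tmin = [-8.8, None, None, -8.3, None, -18.8]
tmean = [-7.6, None, None, None, None, -16.0]

print(nullity_correlation(tmax, tmin))             # missing in exactly the same rows
print(round(nullity_correlation(tmax, tmean), 4))  # missing together in most rows
```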

Sample

First rows

       month  day  t_morn      t_noon  t_evn  tmax   tmin        tmean  method  year
0      1      1    -7.700000   -7.5    -7.3   -7.0   -8.800000   -7.6   manual  2010
1      1      2    -8.300000   -10.0   -8.8   -7.1   -11.100000  -8.9   manual  2010
2      1      3    -11.500000  -9.6    -7.1   -7.1   -11.900000  -9.4   manual  2010
3      1      4    -7.500000   -4.5    -5.6   -3.6   -8.300000   -6.1   manual  2010
4      1      5    -15.200000  -13.8   -16.9  -5.5   -16.900000  -14.7  manual  2010
5      1      6    -18.799999  -15.8   -13.1  -13.1  -18.799999  -16.0  manual  2010
6      1      7    -6.500000   -6.4    -10.4  -5.5   -13.300000  -8.3   manual  2010
7      1      8    -11.300000  -10.6   -13.8  -8.9   -13.800000  -12.0  manual  2010
8      1      9    -14.600000  -10.0   -11.8  -9.2   -15.900000  -12.6  manual  2010
9      1      10   -9.100000   -5.7    -5.0   -4.8   -12.100000  -7.1   manual  2010

Last rows

       month  day  t_morn  t_noon  t_evn  tmax  tmin  tmean  method  year
97510  12     22   -3.0    -2.0    NaN    NaN   NaN   NaN    manual  1756
97511  12     23   -2.0    -1.0    NaN    NaN   NaN   NaN    manual  1756
97512  12     24   -2.0    -0.5    NaN    NaN   NaN   NaN    manual  1756
97513  12     25   -1.5    -0.5    NaN    NaN   NaN   NaN    manual  1756
97514  12     26   -3.0    -5.0    NaN    NaN   NaN   NaN    manual  1756
97515  12     27   -8.0    -11.0   NaN    NaN   NaN   NaN    manual  1756
97516  12     28   -13.5   -7.0    NaN    NaN   NaN   NaN    manual  1756
97517  12     29   -1.0    -3.0    NaN    NaN   NaN   NaN    manual  1756
97518  12     30   -3.0    -5.0    NaN    NaN   NaN   NaN    manual  1756
97519  12     31   -3.0    -4.0    NaN    NaN   NaN   NaN    manual  1756